SODA: An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems
Abstract
This paper describes the SODA scheduler for System S, a highly scalable distributed stream processing system. Unlike traditional batch applications, streaming applications are open-ended, and the system typically cannot delay the processing of the data. The scheduler must be able to shift resource allocation dynamically in response to changes in resource availability, job arrivals and departures, incoming data rates and so on. The design assumptions of System S, in particular, pose additional scheduling challenges. SODA must deal with a highly complex optimization problem, which must be solved in real time while maintaining scalability. SODA relies on a careful problem decomposition and on intelligent use of both heuristic and exact algorithms. We describe the design and functionality of SODA, outline the mathematical components, and describe experiments to show the performance of the scheduler.
Similar resources
Job Admission and Resource Allocation in Distributed Streaming Systems
This paper describes a novel scheme for job admission and resource allocation employed by the SODA scheduler in System S. Capable of processing enormous quantities of streaming data, System S is a large-scale, distributed stream processing system designed to handle complex applications. The problem of scheduling in distributed, stream-based systems is quite unlike that in more traditio...
Optimizing Teleportation Cost in Multi-Partition Distributed Quantum Circuits
Implementing large-scale quantum circuits faces many obstacles, so distributed quantum systems are an appropriate solution for these circuits. Reducing the number of quantum teleportations therefore lowers the cost of implementing a quantum circuit. The minimum number of teleportations can be considered as a measure of the efficiency of distributed quantum systems....
QOS based user driven scheduler for grid environment
As grids are in essence heterogeneous, dynamic, shared and distributed environments, managing these kinds of platforms efficiently is extremely complex. A promising scalable approach to deal with these intricacies is the design of self-managing or autonomic applications. Autonomic applications adapt their execution accordingly by considering knowledge about their own behaviour and environmental...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...
Knowledge-Based Scheduling Resource into Grid System by Second-Pric
In recent years, grid computing systems have become popular for the resolution of large-scale complex problems in science, engineering and industry. Because grid computing focuses on high system scalability and on large-scale resource sharing, an efficient resource management system is crucial for the efficacy of the system. However, providing effective scheduling and resource alloca...